On the complexity measures of genetic sequences

نویسندگان

  • Vladimir D. Gusev
  • Lubov A. Nemytikova
  • Nadia A. Chuzhanova
چکیده

MOTIVATION It is well known that the regulatory regions of genomes are highly repetitive. They are rich in direct, symmetric and complemented repeats, and there is no doubt about the functional significance of these repeats. Among known measures of complexity, the Ziv-Lempel complexity measure reflects most adequately repeats occurring in the text. But this measure does not take into account isomorphic repeats. By isomorphic repeats we mean fragments that are identical (or symmetric) modulo some permutation of the alphabet letters. RESULTS In this paper, two complexity measures of symbolic sequences are proposed that generalize the Ziv-Lempel complexity measure by taking into account any isomorphic repeats in the text (rather than just direct repeats as in Ziv-Lempel). The first of them, the complexity vector, is designed for small alphabets such as the alphabet of nucleotides. The second is based on a search for the longest isomorphic fragment in the history of sequence synthesis and can be used for alphabets of arbitrary cardinality. These measures have been used for recognition of structural regularities in DNA sequences. Some interesting structures related to the regulatory region of the human growth hormone are reported.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Approach to Detect Congestive Heart Failure Using Symbolic Dynamics Analysis of Electrocardiogram Signal

The aim of this study is to show that the measures derived from Electrocardiogram (ECG) signals many a time perform better than the same measures obtained from heart rate (HR) signals. A comparison was made to investigate how far the nonlinear symbolic dynamics approach helps to characterize the nonlinear properties of ECG signals and HR signals, and thereby discriminate between normal and cong...

متن کامل

A New Approach to Detect Congestive Heart Failure Using Symbolic Dynamics Analysis of Electrocardiogram Signal

The aim of this study is to show that the measures derived from Electrocardiogram (ECG) signals many a time perform better than the same measures obtained from heart rate (HR) signals. A comparison was made to investigate how far the nonlinear symbolic dynamics approach helps to characterize the nonlinear properties of ECG signals and HR signals, and thereby discriminate between normal and cong...

متن کامل

Parallel Generation of t-ary Trees

A parallel algorithm for generating t-ary tree sequences in reverse B-order is presented. The algorithm generates t-ary trees by 0-1 sequences, and each 0-1 sequences is generated in constant average time O(1). The algorithm is executed on a CREW SM SIMD model, and is adaptive and cost-optimal. Prior to the discussion of the parallel algorithm a new sequential generation with O(1) average time ...

متن کامل

Genetic and Memetic Algorithms for Sequencing a New JIT Mixed-Model Assembly Line

This paper presents a new mathematical programming model for the bi-criteria mixed-model assembly line balancing problem in a just-in-time (JIT) production system. There is a set of criteria to judge sequences of the product mix in terms of the effective utilization of the system. The primary goal of this model is to minimize the setup cost and the stoppage assembly line cost, simultaneously. B...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Bioinformatics

دوره 15 12  شماره 

صفحات  -

تاریخ انتشار 1999